SARS coronavirus main protease (SARS-CoV Mpro) is essential for the replication of the virus and regarded
as a major antiviral drug target. The enzyme is a cysteine protease, with a catalytic dyad (Cys-145/His-41)
in the active site. Aldehyde inhibitors can bind reversibly to the active-site sulfhydryl of SARS-CoV Mpro.
Previous studies using peptidic substrates and inhibitors showed that the substrate specificity of SARS-CoV
Mpro requires glutamine in the P1 position and a large hydrophobic residue in the P2 position. We
determined four crystal structures of SARS-CoV Mpro in complex with pentapeptide aldehydes (Ac-ESTLQ-H,
Ac-NSFSQ-H, Ac-DSFDQ-H, and Ac-NSTSQ-H). Kinetic data showed that all of these aldehydes exhibit
inhibitory activity towards SARS-CoV Mpro, with Ki values in the µM range. Surprisingly, the X-ray
structures revealed that the hydrophobic S2 pocket of the enzyme can accommodate serine and even
aspartic-acid side-chains in the P2 positions of the inhibitors. Consequently, we reassessed the substrate
specificity of the enzyme by testing the cleavage of 20 different tetradecapeptide substrates with varying
amino-acid residues in the P2 position. The cleavage efficiency for the substrate with serine in the P2
position was 160 times lower than that for the original substrate (P2 = Leu); furthermore, the substrate
with aspartic acid in the P2 position was not cleaved at all. We also determined a crystal structure of
SARS-CoV Mpro in complex with aldehyde Cm-FF-H, which has its P1-phenylalanine residue bound to
the relatively hydrophilic S1 pocket of the enzyme and yet exhibits a high inhibitory activity against
SARS-CoV Mpro, with Ki = 2.24 ± 0.58 µM. These results show that the stringent substrate specificity of
the SARS-CoV Mpro with respect to the P1 and P2 positions can be overruled by the highly electrophilic
character of the aldehyde warhead, thereby constituting a deviation from the dogma that peptidic
inhibitors need to correspond to the observed cleavage specificity of the target protease.

© 2011 Elsevier B.V. All rights reserved. 
1. Introduction

SARS coronavirus main protease (SARS-CoV Mpro) is essential
for the replication of the virus and regarded as a major antiviral
drug target (Anand et al., 2003; Yang et al., 2003; Tan et al.,
2005; Steuber and Hilgenfeld, 2010). Peptide aldehyde inhibitors
are widely used as research tools to characterize the substrate
specificity of serine and cysteine proteases. In complex with the
crystalline target enzyme, they provide a wealth of information
on the specificity-defining subsites of the substrate-binding site.
For this purpose, we have previously described peptide aldehydes
as inhibitors of SARS-CoV Mpro (Al-Gharabli et al., 2006). A major
advantage of aldehydes over other peptide electrophiles is their
reversible binding to the active-site sulfhydryl of cysteine
proteases. We have previously used this property to screen a small
library of non-peptide nucleophiles (mostly amines) for molecules
that would efficiently enhance the inhibition by the aldehyde
through formation of a more strongly binding aldehyde-nucleophile
ligation product. In a second step of this process, which we named
"Dynamic Ligation Screening" (Schmidt et al., 2008), the most
efficient compound had its amino group replaced by an aldehyde
moiety and was reacted with the target enzyme, and the resulting
covalent adduct was subsequently used to screen the same library of
nucleophiles for those that would react with the aldehyde. This way,
starting from a substrate-like peptide aldehyde, we obtained a low-µM
non-peptidic, reversible inhibitor from two fragments which by
themselves had no or only very low inhibitory activity (Schmidt
et al., 2008).

Interestingly, it turned out that some peptide aldehydes with an
amino-acid sequence deviating from the consensus sequence
embracing the cleavage sites of polyprotein substrates were
surprisingly efficient as inhibitors. In particular, peptide aldehydes
with P2 = Asp or Ser inhibited the main protease with a relatively
low IC50 (Al-Gharabli et al., 2006), although it is generally agreed
that the S2 specificity subsite of the enzyme has a strong
preference for large hydrophobic side-chains such as Leu, Phe, or Met
(Fan et al., 2004, 2005; Lai et al., 2006). We therefore assumed that
the hydrophilic P2 side-chain of these inhibitors would be oriented
towards the solvent, rather than occupy the S2 pocket, and that the
hydrophobic P3 residue would be binding to that pocket
(Al-Gharabli et al., 2006; Schmidt et al., 2008). This model was
based on a crystal structure of a peptidyl chloromethyl ketone
bound to the SARS-CoV main protease, in which this arrangement
had been observed (Yang et al., 2003). Here we show by X-ray
crystallography of a number of peptide aldehyde complexes with the
protein that this is not the case. Instead, the hydrophilic Ser or
Asp residues in the P2 position bind to the hydrophobic S2 subsite.
In order to understand these binding modes, the atomic
interactions were analyzed in detail. Also, we report the inhibition
kinetics of these compounds and re-evaluate the cleavage specificity of
the enzyme as far as the P2 position is concerned. In addition, we
determined a crystal structure of SARS-CoV Mpro in complex with
the aldehyde Cm-FF-H (cinnamoyl-Phe-Phe-H), which has a
phenylalanine residue in the P1 position and exhibits high inhibitory
activity against SARS-CoV Mpro, with Ki = 2.24 ± 0.58 µM. These
results show that the stringent substrate specificity of the SARS-CoV
Mpro with respect to the P1 and P2 positions can be overruled by
the highly electrophilic character of the aldehyde warhead.


2. Materials and methods 

2.1. Synthesis of aldehydes with various P2 residues 

Chemical synthesis of peptide aldehydes Ac-ESTLQ-H,
Ac-NSFSQ-H, Ac-DSFDQ-H, and Ac-NSTSQ-H was performed
employing a solid-state method (Al-Gharabli et al., 2006). Briefly,
protected glutamine aldehyde obtained by racemization-free
oxidation of the corresponding amino alcohol with Dess-Martin
periodinane was immobilized on a threonyl resin as oxazolidine.
Following N-tert-butyloxycarbonyl (Boc)-protection of the ring
nitrogen to yield the N-protected oxazolidine linker, peptide
synthesis was performed on the resin.


2.2. Synthesis of Cm-FF-H (Scheme 1) 

Synthesis of Cm-FF-H was carried out by amidation of
Boc-L-phenylalanine with L-phenylalanine methyl ester followed
by deprotection of the Boc group and acylation of the corresponding
product with cinnamoyl chloride to provide
N-cinnamoyl-L-phenylalanyl-L-phenylalanine methyl ester. Cm-FF-H was
then obtained by reduction of the ester with NaBH4/CaCl2 and alcohol
oxidation with IBX.


2.3. Enzyme kinetics 


The recombinant production and purification of SARS-CoV Mpro
with authentic N and C termini were performed as described
previously (Xue et al., 2007; Verschueren et al., 2008). The substrate
Dabcyl-KTSAVLQ↓SGFRKME-(Edans)-amide (95% purity; Biosyntan
GmbH, Berlin, Germany), which contains a main-protease cleavage
site (indicated by the arrow), was used in the fluorescence
resonance energy transfer (FRET)-based cleavage assay.
The substrate stock was prepared by dissolving the peptide
in DMSO at 1 mM. The dequenching of the Edans fluorescence due to
the cleavage of the substrate as catalyzed by the SARS-CoV Mpro
was monitored at 490 nm with excitation at 340 nm, using a Cary
Eclipse fluorescence spectrophotometer. The experiments were
performed in a buffer consisting of 20 mM Tris-HCl (pH 7.3),
100 mM NaCl, and 1 mM EDTA. The reaction was initiated by
adding different final concentrations of the FRET peptide (10-50 µM)
to a solution containing SARS-CoV Mpro (final concentration
0.5 µM). Kinetic constants (Vmax and Km) were derived by fitting
the data to the Michaelis-Menten equation, V = Vmax × [S]/(Km + [S]).
Then kcat was calculated according to the equation
kcat = Vmax/[E]. In order to estimate the population of catalytically
active Mpro dimers, the respective monomer and dimer
concentrations were calculated according to Graziano et al. (2006). The
dimerization of the SARS-CoV Mpro follows the scheme

$$\mathrm{M} + \mathrm{M} \rightleftharpoons \mathrm{D}$$

The equilibrium dissociation constant, KD, is defined by

$$K_D = \frac{[M]^2}{[D]},$$

where [M] and [D] are the molar concentrations of the monomer
and dimer, respectively. The total protein concentration [M]t
expressed in terms of molar monomer equivalents is

$$[M]_t = [M] + 2[D].$$

Since

$$[D] = \frac{[M]_t - [M]}{2},$$

substituting the above equation into the expression for the KD yields

$$K_D = \frac{2[M]^2}{[M]_t - [M]}.$$

Solving the above quadratic equation for [M] gives

$$[M] = \frac{-K_D + \sqrt{K_D^2 + 8[M]_t K_D}}{4}.$$

It then follows that

$$[D] = \frac{K_D + 4[M]_t - \sqrt{K_D^2 + 8[M]_t K_D}}{8}.$$
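The closed-form expressions for the monomer and dimer concentrations can be evaluated directly. The following Python sketch is an illustration only, not the code used in this study; the function name `monomer_dimer_concentrations` and the example inputs (0.5 µM total enzyme, with KD values of 0.25 and 1.0 µM taken from figures quoted elsewhere in this paper) are assumptions chosen for demonstration.

```python
import math


def monomer_dimer_concentrations(total_monomer, k_d):
    """Return ([M], [D]) for the equilibrium M + M <=> D.

    total_monomer: total protein in monomer equivalents, [M]t
    k_d:           dimer dissociation constant, KD = [M]^2 / [D]
    Both arguments must be given in the same concentration unit.
    """
    if total_monomer < 0 or k_d <= 0:
        raise ValueError("total_monomer must be >= 0 and k_d must be > 0")
    # Positive root of 2[M]^2 + KD[M] - KD[M]t = 0
    root = math.sqrt(k_d ** 2 + 8.0 * total_monomer * k_d)
    monomer = (-k_d + root) / 4.0
    # Mass balance: [M]t = [M] + 2[D]
    dimer = (total_monomer - monomer) / 2.0
    return monomer, dimer


if __name__ == "__main__":
    # Illustrative inputs in uM (hypothetical choice of conditions)
    for k_d in (0.25, 1.0):
        m, d = monomer_dimer_concentrations(0.5, k_d)
        print(f"KD = {k_d:.2f} uM: [M] = {m:.3f} uM, [D] = {d:.3f} uM")
```

The function solves the quadratic for [M] and then obtains [D] from the mass balance, so the two returned values always satisfy [M] + 2[D] = [M]t.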


Scheme 1. Synthesis of Cm-FF-H. Reagents and conditions: (a) EDCI, HOBt, Boc-Phe-OH, DMF, 0-25 °C, 18 h, 78%; (b) (i) TFA, CH2Cl2, 0-25 °C, 2 h; (ii) cinnamoyl chloride,
N-methylmorpholine, THF, DMF, 0-25 °C, 12 h, 50%; (c) NaBH4, CaCl2, EtOH, THF (1:1), 0-25 °C, 14 h, 35%; (d) IBX, DMSO, 0-25 °C, 5 h, 47%.

2.4. Aldehyde inhibition assay

Aldehyde stock solutions were prepared by dissolving the
compounds in DMSO at 1 mM. All aldehydes form a covalent bond
between the aldehyde (-CHO) group of the inhibitor and the
sulfhydryl (-SH) group of Cys145 of SARS-CoV Mpro. The binding
of these inhibitors is reversible (Schmidt et al., 2008), although it is
strong. Therefore, they were treated as reversible tight-binding
inhibitors. For the measurement of the inhibition constant Ki,
SARS-CoV Mpro was incubated with the aldehyde inhibitor in
reaction buffer at room temperature for 30 min. Then the FRET peptide
substrate, Dabcyl-KTSAVLQ↓SGFRKME-Edans, was added and the
reaction velocity was calculated according to substrate cleavage.
Values of the intrinsic (V°) and apparent (V) lpp , /Cf pp ) catalytic 
parameters for SARS-CoV M pro catalyzing the hydrolysis of peptide 
substrate were determined in the absence and presence of 
aldehyde, respectively. The apparent inhibition constants (K) lpp ) 
for aldehyde binding to SARS-CoV M pro were obtained from the 
dependence of Vj lpp on the inhibitor concentration ([/]) at fixed 
substrate concentration ([S]), according to the equation V* pp = 
Vj lpp x [/])//<Tf pp + [/]) (Ascenzi et al., 1987; Copeland, 2000). Values 
of the intrinsic inhibition constant (Ki) for aldehyde binding to
SARS-CoV Mpro were calculated according to the equation Ki,app = Ki
× (1 + [S]/Km) (Ascenzi et al., 1987; Copeland, 2000).
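The conversion from the apparent to the intrinsic inhibition constant is a one-line rearrangement of the last equation. The sketch below is illustrative only: the function name `intrinsic_ki` is hypothetical, and the example substrate concentration and apparent Ki are made-up inputs, not measurements from this study. Only the Km of 24.5 µM is a value reported in this paper.

```python
def intrinsic_ki(ki_app, substrate_conc, km):
    """Return Ki from Ki,app = Ki * (1 + [S]/Km).

    All three arguments must share the same concentration unit.
    """
    if km <= 0:
        raise ValueError("km must be positive")
    if substrate_conc < 0:
        raise ValueError("substrate_conc must be non-negative")
    return ki_app / (1.0 + substrate_conc / km)


if __name__ == "__main__":
    # Hypothetical inputs in uM: apparent Ki of 15, [S] of 20; Km as reported (24.5)
    print(f"Ki = {intrinsic_ki(15.0, 20.0, 24.5):.2f} uM")
```

At vanishing substrate concentration the correction factor tends to 1, so the apparent and intrinsic constants coincide.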

2.5. Peptide cleavage assay 

Twenty peptide substrates harboring alternative amino-acid
residues in P2 (=X; SWTSAVXQ↓SGFRKWA) were purchased from
GL Biochemistry Ltd. (Shanghai, China). These peptides correspond
to the N-terminal autocleavage site of the SARS-CoV Mpro, with
the exception of the P7 Ile, which had been replaced by Trp, and
the P6' Met, which had also been replaced by Trp. All peptide
substrate stocks were dissolved in DMSO at 10 mM. To determine the
kcat/Km for the substrate, 0.2 mM substrate peptide was incubated
with 10 µM SARS-CoV Mpro in 40 mM Tris-HCl buffer, pH 7.3.
Aliquots of the reactions were removed at different times and stopped
by the addition of 1% trichloroacetic acid. Separation of products and
substrate was carried out using reverse-phase (RP) HPLC and a linear
gradient (1-90%) of acetonitrile in 0.1% trifluoroacetic acid. The
absorbance was determined at 280 nm, and peak areas were
calculated by integration. The (kcat/Km)app ratio was determined by
plotting the substrate peak area against time using the equation
ln PA = C − (kcat/Km)app × CE × t, where PA is the peak area of the
substrate peptide, CE is the total concentration of SARS-CoV Mpro,
t is the reaction time, and C is an experimental constant
(Fan et al., 2004, 2005).
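The pseudo-first-order relation between ln PA and time means that (kcat/Km)app follows from the slope of a straight-line fit. The sketch below shows one way to do this with an ordinary least-squares fit in plain Python. It is an assumption-laden illustration, not the analysis code of this study: the function name `apparent_kcat_over_km` is hypothetical, and the time points and peak areas in the demo are synthetic values generated from an arbitrary rate constant.

```python
import math


def apparent_kcat_over_km(times_s, peak_areas, enzyme_conc_molar):
    """Estimate (kcat/Km)app in M^-1 s^-1 from substrate depletion.

    Fits ln(PA) = C - (kcat/Km)app * CE * t by ordinary least squares,
    where PA is the substrate peak area and CE the enzyme concentration (M).
    """
    if len(times_s) != len(peak_areas) or len(times_s) < 2:
        raise ValueError("need at least two matching (time, area) pairs")
    if enzyme_conc_molar <= 0:
        raise ValueError("enzyme concentration must be positive")
    log_areas = [math.log(a) for a in peak_areas]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(log_areas) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, log_areas))
    slope = sxy / sxx  # equals -(kcat/Km)app * CE
    return -slope / enzyme_conc_molar


if __name__ == "__main__":
    # Synthetic, noise-free data (hypothetical): true value 2000 M^-1 s^-1, CE = 10 uM
    enzyme = 10e-6
    times = [0.0, 60.0, 120.0, 180.0, 240.0]
    areas = [1000.0 * math.exp(-2000.0 * enzyme * t) for t in times]
    print(f"(kcat/Km)app = {apparent_kcat_over_km(times, areas, enzyme):.0f} M^-1 s^-1")
```

With real chromatograms the peak areas carry noise, so the quality of the linear fit should be inspected before the slope is interpreted.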

2.6. Crystallization of the complexes 

SARS-CoV Mpro with authentic chain termini was concentrated
to 10 mg/ml and crystallized by vapor diffusion using sitting drops
(Xue et al., 2007). The crystals grew overnight at 20 °C by
equilibration against a reservoir containing 6-8% polyethylene glycol (PEG)
6000, 0.1 M MES (pH 6.0), 3% 2-methyl-2,4-pentanediol (MPD),
and 3% DMSO. All aldehydes were dissolved in 8% PEG 6000, 0.1
M MES (pH 6.0), 3% MPD, and 10% DMSO to a concentration of
10 mM. Crystals of the aldehyde complexes of SARS-CoV Mpro were
obtained either by adding a 4-µl aliquot of aldehyde solution to the
drop and soaking the crystals for 12 h, or by incubating the
enzyme for 2 h at 20 °C with a 7-fold excess of the aldehyde
solution and subsequently cocrystallizing at 20 °C against a reservoir
containing 8% PEG 6000, 0.1 M MES (pH 6.0), 3% MPD, and 3%
DMSO. In the latter case, nucleation was initiated by microseeding
using crushed monoclinic (space group C2) crystals of the
SARS-CoV Mpro. Prior to diffraction data collection, the aldehyde complex
crystals obtained from soaking and cocrystallization were
transferred for a few seconds to a cryoprotectant solution containing
the crystallization ingredients and 20% MPD.

2.7. Crystallographic data collection and processing, and structure 
elucidation and refinement 

All diffraction data were collected at 100 K at the Joint EMBL/
University of Hamburg/University of Lübeck synchrotron beamline
X13 at DESY (Hamburg, Germany), using a 165-mm MAR CCD
detector (Mar Research, Hamburg, Germany), or at synchrotron
beamline BL14.1 at BESSY (Berlin, Germany), using an MX225
CCD detector (Rayonix, Evanston, IL). Statistics of data collection,
processing, and refinement are summarized in Table 1. Data were
processed with MOSFLM (Leslie, 1992), and scaled using the SCALA
program from the CCP4 suite (Collaborative Computational Project,
1994; Evans, 2006).

Structure elucidation and refinement were carried out using the
CCP4 suite of programs (Collaborative Computational Project,
1994; Potterton et al., 2003). Crystal structures were determined
by molecular replacement, using the original X-ray structure of
the SARS-CoV Mpro with authentic chain termini (PDB ID: 2H2Z)
(Xue et al., 2007) as the initial model. REFMAC (Murshudov et al.,
1997) was employed for structure refinement. The computer
graphics program Coot (Emsley and Cowtan, 2004) was used for
interpretation of electron density maps and model building. Gap
volumes between inhibitor and protein in the subsites of the
enzyme were calculated using UCSF-Chimera (Pettersen et al.,
2004). The molecular graphics package PyMOL (DeLano, 2002)
was used to generate the figures.

3. Results and discussion 

3.1. Inhibition of SARS-CoV Mpro by peptide aldehydes

Table 1

Data collection and refinement statistics.

| Inhibitor | Ac-ESTLQ-H | Ac-DSFDQ-H | Ac-NSTSQ-H | Ac-NSFSQ-H | Ac-ESTLQ-H | Cm-FF-H |
|---|---|---|---|---|---|---|
| Data collection statistics | | | | | | |
| Wavelength (Å) | 0.8123 | 0.8123 | 0.8123 | 0.8123 | 0.8123 | 0.9184 |
| Method | Soaking | Soaking | Soaking | Soaking | Cocrystallization | Soaking |
| Space group | C2 | C2 | C2 | C2 | P2₁ | C2 |
| Unit cell a (Å) | 108.91 | 108.74 | 107.83 | 107.87 | 52.37 | 107.75 |
| Unit cell b (Å) | 81.36 | 81.86 | 82.46 | 82.09 | 96.79 | 82.88 |
| Unit cell c (Å) | 53.40 | 53.21 | 53.32 | 53.08 | 68.07 | 53.50 |
| Unit cell β (°) | 104.35 | 104.26 | 104.65 | 104.30 | 102.49 | 104.63 |
| NP (a) | 1 | 1 | 1 | 1 | 2 | 1 |
| Resolution range (Å) | 32.21-2.60 | 32.32-2.40 | 26.58-2.58 | 64.56-3.05 | 27.29-1.89 | 32.82-1.99 |
| Number of reflections | 13,957 | 16,009 | 13,938 | 8,643 | 50,901 | 31,388 |
| Redundancy (b) | 3.8 (3.8) | 3.4 (2.8) | 3.0 (2.7) | 3.4 (3.4) | 2.8 (2.6) | 3.6 (3.4) |
| Completeness (%) (b) | 99.9 (100.0) | 91.3 (85.7) | 96.7 (82.5) | 99.9 (100.0) | 94.5 (78.3) | 99.7 (98.1) |
| Rmerge (%) (b) | 10.2 (45.6) | 10.4 (25.7) | 6.0 (32.0) | 8.3 (47.2) | 6.7 (33.6) | 17.0 (36.6) |
| I/σ(I) (b) | 8.7 (2.4) | 5.8 (2.3) | 10.1 (2.8) | 10.2 (2.3) | 10.3 (2.7) | 4.6 (2.2) |
| Refinement statistics | | | | | | |
| R/Rfree (%) | 18.5/22.5 | 21.2/23.9 | 19.8/26.0 | 19.0/25.5 | 18.7/23.8 | 21.7/26.9 |
| r.m.s. deviation from ideal geometry | | | | | | |
| Bonds (Å) | 0.010 | 0.009 | 0.017 | 0.015 | 0.009 | 0.018 |
| Angles (°) | 1.535 | 1.162 | 1.820 | 1.661 | 1.172 | 1.689 |
| Ramachandran plot | | | | | | |
| Most favored (%) | 88.0 | 90.3 | 83.9 | 80.6 | 90.6 | 89.8 |
| Allowed (%) | 10.9 | 8.6 | 14.2 | 18.3 | 8.3 | 9.1 |
| Generously allowed (%) | 0.4 | 0.7 | 1.1 | 0.4 | 0.4 | 0.8 |
| Disallowed (%) | 0.7 | 0.4 | 0.7 | 0.8 | 0.8 | 0.4 |
| PDB ID | 3SNE | 3SNB | 3SNC | 3SNA | 3SND | 3SN8 |

(a) Number of protein molecules per asymmetric unit.
(b) Numbers in parentheses are for the outermost resolution shell.

The kinetic parameters of SARS-CoV Mpro with authentic chain
termini were determined using the FRET substrate
Dabcyl-KTSAVLQ↓SGFRKME-Edans amide. The Km value for this substrate
was determined as 24.5 µM and the apparent kcat/Km value was
2359 M⁻¹ s⁻¹. However, the dissociation of the catalytically
active SARS-CoV Mpro dimer into inactive monomers (Fan et al.,
2004) has to be taken into account. A wide range of KD values has
been reported for Mpro dimer dissociation (see Grum-Tokars et al.
(2008) for an overview). For the enzyme with authentic chain
termini, the latter authors reported a KD of 0.25 to 1.0 µM. As the Mpro
dimer tends to be stabilized by the presence of substrate (Cheng
et al., 2010), we used the lower limit of this range for estimation
of the necessary corrections and obtained a kcat/Km value of
7863 M⁻¹ s⁻¹. The Km value determined in our study is of the same
magnitude as other values derived from data not corrected for
dissociation, which are typically in the range from 10 to 50 µM
(Grum-Tokars et al., 2008). This suggests that much higher values
reported previously (e.g., Verschueren et al., 2008) should probably
be reconsidered.
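The magnitude of the dimer-dissociation correction can be checked with a short calculation. The sketch below rests on an assumption that is not spelled out in the text: that the apparent kcat/Km was computed with the total enzyme concentration (0.5 µM, as given in the Methods) and that the corrected value is obtained by rescaling to the concentration of active dimer, i.e. corrected = apparent × [M]t/[D]. The function names are hypothetical. Under this assumption and with KD = 0.25 µM, the dimer concentration is about 0.152 µM; rounding it to 0.15 µM gives 7863 M⁻¹ s⁻¹, matching the reported figure, whereas the unrounded dimer concentration gives about 7739 M⁻¹ s⁻¹.

```python
import math


def dimer_concentration(total_monomer_um, kd_um):
    """Dimer concentration [D] (uM) for M + M <=> D with KD = [M]^2 / [D]."""
    root = math.sqrt(kd_um ** 2 + 8.0 * total_monomer_um * kd_um)
    monomer = (-kd_um + root) / 4.0
    return (total_monomer_um - monomer) / 2.0


def corrected_kcat_over_km(apparent, total_monomer_um, kd_um, dimer_decimals=None):
    """Rescale an apparent kcat/Km from total enzyme to active dimer.

    ASSUMPTION (not stated in the paper): corrected = apparent * [M]t / [D].
    dimer_decimals optionally rounds [D] before the rescaling.
    """
    dimer = dimer_concentration(total_monomer_um, kd_um)
    if dimer_decimals is not None:
        dimer = round(dimer, dimer_decimals)
    return apparent * total_monomer_um / dimer


if __name__ == "__main__":
    exact = corrected_kcat_over_km(2359.0, 0.5, 0.25)
    rounded = corrected_kcat_over_km(2359.0, 0.5, 0.25, dimer_decimals=2)
    print(f"[D] = {dimer_concentration(0.5, 0.25):.4f} uM")
    print(f"corrected kcat/Km, exact [D]:   {exact:.0f} M^-1 s^-1")
    print(f"corrected kcat/Km, rounded [D]: {rounded:.0f} M^-1 s^-1")
```

The agreement with the rounded dimer concentration suggests, but does not prove, that this is how the published correction was carried out.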

Initially, four pentapeptide substrate-analogous aldehydes
containing the canonical P1 residue, glutamine, namely Ac-ESTLQ-H,
Ac-NSTSQ-H, Ac-DSFDQ-H, and Ac-NSFSQ-H, were tested for
inhibition of SARS-CoV Mpro. Overall, the four analogues were found
to be reversible tight-binding inhibitors. Harboring the canonical
P2 Leu, aldehyde Ac-ESTLQ-H exhibited inhibition with a relatively
low Ki of 8.27 ± 1.52 µM. Aldehydes Ac-NSTSQ-H, Ac-DSFDQ-H,
and Ac-NSFSQ-H, all with a non-canonical P2 residue, exhibited
moderate inhibition with Ki values of 40.98 ± 2.63, 41.24 ± 2.25,
and 72.73 ± 3.60 µM, respectively. Surprisingly, aldehyde Cm-FF-H,
carrying a cinnamoyl group in the P3 and a Phe residue in the P1
position, had an even higher inhibitory activity against SARS-CoV
Mpro than the four pentapeptide aldehydes, with a Ki of 2.24 ± 0.58 µM.

3.2. Overall structures of the aldehyde complexes 

The aldehydes Ac-ESTLQ-H, Ac-NSTSQ-H, Ac-DSFDQ-H,
Ac-NSFSQ-H, and Cm-FF-H were separately soaked into crystals
of SARS-CoV Mpro. The crystals were all of space group C2, which
is often observed for SARS-CoV Mpro (Lee et al., 2005; Xue et al.,
2007; Verschueren et al., 2008). These crystals contain one
SARS-CoV Mpro monomer per asymmetric unit and the dimer (which is
the enzymatically active species) is formed through the symmetry
of the crystal. The four pentapeptide aldehydes Ac-ESTLQ-H,
Ac-NSTSQ-H, Ac-DSFDQ-H, and Ac-NSFSQ-H are bound in extended
conformations in the S6-S1 specificity subsites of SARS-CoV Mpro.
Cm-FF-H occupies sites S3-S1. Remarkably, the P1 phenylalanine
side chain of this inhibitor is bound deeply in the S1 pocket, which
is generally considered to be specific for glutamine. 2Fo − Fc
electron density maps of these aldehyde inhibitors are shown in
Fig. 1. In all complexes, continuous electron density between the
aldehyde carbonyl C-atom of the inhibitor and Cys145-Sγ of
SARS-CoV Mpro (Fig. 1) indicates the formation of a thiohemiacetal,
as a result of the nucleophilic attack of the catalytic cysteine onto
the C-terminal aldehyde of the inhibitor. The main-chain
conformations of SARS-CoV Mpro in the five complexes are basically
identical, with overall root mean-square deviations (RMSD) of
0.16-0.36 Å for Cα atoms. In addition to the crystal soaking
experiments, we also tried to cocrystallize SARS-CoV Mpro with
the aldehydes. The crystals obtained were predominantly of space
group P2₁, with the exception of the complex with Ac-NSFSQ-H,
which still displayed space group C2. However, in most of the
P2₁ crystals, no electron density for the aldehyde could be
observed; thus, while the presence of the inhibitors induced a
change of space group in most cases, the compounds themselves
were not detected in the crystals. An exception was Ac-ESTLQ-H.
The crystals of its complex with the Mpro diffracted to 1.89 Å, but
in both enzyme monomers in the asymmetric unit, electron
density could only be seen for residues P1 (Gln) and P2 (Leu) of the
inhibitor (Fig. 1E and F). Crystallographic data and refinement
statistics for all six crystal structures are summarized in Table 1.

3.3. Dual configurations of the thiohemiacetal in the complex with Ac-NSFSQ-H

In the SARS-CoV Mpro complex with aldehyde inhibitor
Ac-NSFSQ-H, the thiohemiacetal adopts alternative configurations
(Fig. 2A and B). This is likely the result of nucleophilic attack by
Cys145 onto the planar carbonyl from either side. In the (R)-isomer,
the thiohemiacetal oxygen forms a hydrogen bond with the
imidazole of His41 (3.30 Å) of the catalytic dyad (Fig. 2C), whereas in the
(S)-isomer, the oxygen atom points away from His41 and into the
oxyanion hole, forming H-bonds with the main-chain amides of
Gly143 (2.71 Å) and Cys145 (3.15 Å) (Fig. 2D). Previous X-ray and
NMR studies of aldehyde binding to various cysteine and serine
proteases demonstrated that one or both configurations were
present in the structures of the complexes (Delbaere and Brayer, 1985;
Webber et al., 1998; Robin et al., 2009).

Fig. 1. Inhibitor binding at the active site of SARS-CoV Mpro. 2Fo − Fc electron-density maps are shown for inhibitors (A) Ac-ESTLQ-H, (B) Ac-NSTSQ-H, (C) Ac-DSFDQ-H, (D)
Ac-NSFSQ-H, (E) and (F) the visible portion of Ac-ESTLQ-H cocrystallized with the Mpro (molecules A and B, respectively), and (G) Cm-FF-H. The maps are contoured at a level of 1σ. The
catalytic Cys145 of SARS-CoV Mpro forms a thiohemiacetal with the aldehyde group of the inhibitors.

Fig. 2. Dual configurations of the thiohemiacetal in the complex with Ac-NSFSQ-H. (A) and (B): Ac-NSFSQ-H binding at the active site in two alternative configurations.
Hydrogen bonding interactions are represented by broken lines. Distances between atoms are shown in Å. (C) and (D): Schematic diagram of the aldehyde inhibitor
Ac-NSFSQ-H attacked by the catalytic cysteine 145, leading to the formation of two possible diastereomeric products.

3.4. The S2 subsite 

The S2 subsite is a large pocket lined by residues His41, Met49,
Cys145, His164, Met165, Asp187, Arg188mc (mc: contribution
from main-chain atoms only), and Gln189 (Fig. 1). At the cleavage
sites of the SARS-CoV Mpro in the viral polyproteins, the P2 position
is mostly found to be occupied by Leu or Phe. Accordingly, almost
all inhibitors designed for the enzyme carry a large hydrophobic
group in the P2 position. We were therefore very surprised to find
the Ser or Asp side chains in the P2 position of the aldehydes bound
to this hydrophobic site (Fig. 1). In the SARS-CoV Mpro complexes
with Ac-NSTSQ-H or Ac-NSFSQ-H, there is no hydrogen-bonding
interaction between the side chain of P2-Ser and residues of the
S2 subsite. As the Ser side chain is small, there is no steric barrier
for Ser binding in the large S2 pocket. The serine does not quite
reach the "bottom" of the S2 pocket, but the distance between
its Oγ atom and the carbonyl oxygen of residue Gln192 is 4.0 Å,
leaving no space for a water molecule to locate in between.
Accordingly, we did not find any electron density suggesting the presence
of a well-ordered water in the S2 pocket, although disordered
water is probably present in the pocket on both sides of the serine
side-chain, where free spaces with a total volume of >70 Å³ are
present between the inhibitor and the protein. It should be noted
that the side-chain of Gln189, which contributes to one wall of the
S2 pocket, is quite flexible and adopts different conformations in
the various inhibitor complexes, allowing or preventing the access
of water to the pocket.

Fig. 3. Interactions of P2-Asp in the S2 subsite in the complex structure SARS-CoV
Mpro: Ac-DSFDQ-H. P2-Asp carboxylate oxygens interact with the sulfur atoms of
Met49 and Met165. These non-bonded interactions are represented by dashed
lines. Distances between atoms are shown in Å.

In the case of the SARS-CoV Mpro complex with Ac-DSFDQ-H, the P2
Asp side-chain is situated between the side chains of Met49 and
Met165, which form opposite walls of the S2 pocket. Most
interestingly, its carboxylate oxygens undergo close interactions with the
sulfur atoms of the two methionines. The distances (r(S···O)) between
the Sδ atoms of Met49 and Met165 on the one hand and the Oδ2
and Oδ1 atoms, respectively, of P2-Asp on the other are 3.36 and
3.45 Å (Fig. 3). The relative distances (d(S···O)) can then be calculated
according to d(S···O) = r(S···O) − vdw(S) − vdw(O), where values of 1.80
and 1.52 Å are used as van der Waals radii (vdw) for S and O atoms,
respectively. With d(S···O) = 0.04 and 0.13 Å, the distances between
the P2-Asp and the two S2-Met side-chains fulfill the condition
of d(S···O) ≤ 0.2 Å (Iwaoka et al., 2002) for a non-bonded S···O
contact. These interactions are not in the plane of the methionine
sulfides; the θ values (θ is the polar angle between the normal to
the sulfide plane and the S···O vector (Pal and Chakrabarti,
2001)) are 30.6° and 52.8°. Similar nonbonded interactions
between the methionine sulfur atoms and main-chain carbonyl
oxygens or carboxylate side-chains have been detected previously
in the hydrophobic cores of proteins and were proposed to
stabilize the protein fold (Pal and Chakrabarti, 2001). It has also been
suggested that S···O interactions should be taken into account in
protein engineering studies (Iwaoka et al., 2002; Pal and
Chakrabarti, 2001), but to the best of our knowledge, we provide here the
first description of a methionine-carboxylate interaction in a
protein-ligand complex. The unexpected finding of Ser and Asp
binding in the S2 subsite constitutes a deviation from the dogma that
peptide inhibitors of proteases should contain amino-acid residues
corresponding to the sequence specificity of the target enzyme.
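The S···O contact criterion used in this section is simple arithmetic and can be reproduced in a few lines. The sketch below uses only the numbers quoted in the text (observed distances of 3.36 and 3.45 Å, van der Waals radii of 1.80 and 1.52 Å, and the 0.2 Å threshold); the function and constant names are hypothetical and serve only as an illustration.

```python
VDW_S = 1.80  # van der Waals radius of sulfur (Angstrom), as used in the text
VDW_O = 1.52  # van der Waals radius of oxygen (Angstrom), as used in the text
CONTACT_THRESHOLD = 0.2  # upper limit on the relative distance (Angstrom)


def relative_distance(r_so):
    """Relative S...O distance: observed distance minus the sum of vdw radii."""
    return r_so - VDW_S - VDW_O


def is_so_contact(r_so):
    """True if the S...O pair satisfies the non-bonded contact criterion."""
    return relative_distance(r_so) <= CONTACT_THRESHOLD


if __name__ == "__main__":
    # Observed distances from the Ac-DSFDQ-H complex, as quoted in the text
    observed = {
        "Met49 SD ... P2-Asp OD2": 3.36,
        "Met165 SD ... P2-Asp OD1": 3.45,
    }
    for label, r in observed.items():
        print(f"{label}: d = {relative_distance(r):.2f} A, contact = {is_so_contact(r)}")
```

Both observed pairs give relative distances of 0.04 and 0.13 Å and therefore pass the criterion, in agreement with the values stated above.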

3.5. Analysis of peptide substrates harboring different amino-acid 
residues in P2 

The unexpected observation of the P2-Asp residue of aldehyde
Ac-DSFDQ-H binding to the hydrophobic S2 pocket prompted us
to re-determine the cleavage specificity of the SARS-CoV Mpro with
respect to the P2 position. Within the context of the peptide
substrate SWTSAVXQ↓SGFRKWA, all 20 proteinogenic amino acids
were tested in position P2 (=X). In the HPLC-based assay, the
protease concentration was kept constant at 0.5 µM. The canonical
substrate with Leu in the P2 position was completely cleaved
within two minutes and was therefore used as the standard to compare
the proteolytic activities with other substrates. The results are
shown in Fig. 4. Our study confirmed that the most favored residues
in the P2 position of the substrate (after Leu) are the hydrophobic
amino acids Phe, Met, Val, and Ile. Substrates with polar amino
acids in P2 are poorly cleaved. The cleavage efficiency for the
substrate with P2 = Ser was 160 times lower than for the best
substrate (P2 = Leu); furthermore, the peptide with P2 = Asp was not
cleaved at all after incubation with SARS-CoV Mpro for 16 h.

Fig. 4. Relative cleavage efficiencies of 20 peptide substrates harboring different
amino-acid residues in P2.

3.6. S1 subsite

The S1 subsite is lined by residues Phe140, Asn142, Gly143,
Ser144, Cys145, His163, Glu166, and His172. From the substrate
specificity results (Fan et al., 2004, 2005; Lai et al., 2006), P1 has
to be Gln. Moreover, in all SARS-CoV Mpro complex structures
described so far (Lee et al., 2005; Xue et al., 2007; Yang et al., 2005,
2003; Yin et al., 2007), the S1 subsite is occupied by a polar group,
i.e. the side chain of Gln or a five-membered lactam (used as a
glutamine surrogate). Shie et al. synthesized a series of
α,β-unsaturated esters containing both P1 and P2 phenylalanine residues
which showed modest inhibitory activity against the SARS-CoV
Mpro (IC50 = 11-39 µM) (Shie et al., 2005). The α,β-unsaturated
ethyl ester of 4-(dimethylamino)cinnamoyl-Phe-Phe was reported
to be a potent inhibitor, with an inhibition constant of 0.52 µM
(Shie et al., 2005). The computer model for the complex of
SARS-CoV Mpro with the compound proposed that the P1 and P2 phenyl
groups occupy the S2 and S3 pockets, respectively (Shie et al.,
2005). However, in the crystal structure of the complex with the
aldehyde Cm-FF-H that we describe in this communication, the
side chain of P1-Phe is inserted into the S1 subsite (Fig. 1G). In
order to understand this unexpected deviation from the commonly
observed specificity, we compared the interaction with the S1
subsite observed in our structures of the complexes with Cm-FF-H and
Ac-ESTLQ-H, the latter of which corresponds exactly to the
described cleavage specificity of the Mpro. The predominant S1
specificity of the enzyme for Gln is determined primarily by
His163. In the complex formed with Ac-ESTLQ-H, Nε2 of His163
donates a 2.83-Å hydrogen bond to the side-chain carbonyl oxygen
(Oε1) of P1-Gln (Fig. 1). In the complex with Cm-FF-H, the side
chain of Asn142 undergoes an 83.8° rotation about its χ1 torsion
angle compared to the conformation in the complex with
Ac-ESTLQ-H (Fig. 5), leading to an opening of the S1 subsite towards
the solvent. A similar movement of Asn142 has been observed in
the crystal structure of SARS-CoV Mpro in complex with an
inhibitor called "N1" (PDB ID: 1WOF), where both the regular
conformation of this residue and the one observed in our structure were
found (Yang et al., 2005). The P1-benzyl group of Cm-FF-H binds
to the S1 subsite by making hydrophobic interactions with
Phe140, Leu141, Asn142, and the P3 cinnamoyl group of
Cm-FF-H. As the S1 subsite specificity does not seem to be as stringent
as previously thought, it may well be possible that other chemical
groups can be accommodated here, thereby expanding the
opportunities for designing and synthesizing efficient inhibitors of the
SARS-CoV Mpro.

Fig. 5. Superimposition of the structures of SARS-CoV Mpro complexes with
Cm-FF-H (red) and Ac-ESTLQ-H (green) showing the conformational changes in the S1
subsite. (For interpretation of the references to color in this figure legend, the reader
is referred to the web version of this article.)

4. Conclusions 

In this study, we report six crystal structures of SARS-CoV Mpro
in complex with peptide aldehydes and the inhibition kinetics of
these compounds. The crystal structures reveal the mechanism of
aldehyde binding to the active site of SARS-CoV Mpro and provide
detailed information on the atomic interactions. Since Asp or Ser
were found to bind to the hydrophobic S2 subsite, and Phe was
located in the hydrophilic S1 subsite of SARS-CoV Mpro, we
conclude that the stringent substrate specificity of the SARS-CoV Mpro
with respect to the P1 and P2 positions can be overruled by the
highly electrophilic character of the aldehyde warhead. This
constitutes a deviation from the dogma that peptidic protease
inhibitors should comprise an amino-acid sequence corresponding to the
cleavage specificity of the target enzyme. The observed
non-bonded interaction of the carboxylate oxygen atoms of an
aspartate residue of an inhibitor with the thioether moieties of two
methionines forming part of a hydrophobic pocket of the target
protein is remarkable and suggests that such S···O interactions
should be added to the repertoire of computer-aided drug design.

4.1. Accession numbers 

Atomic coordinates and structure factors have been deposited 
in the RCSB Protein Data Bank under ID codes 3SN8, 3SNA, 3SNB, 
3SNC, 3SND, and 3SNE (see Table 1). 